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Abstract. The metabolic network of a living cell involves several hundreds 
or thousands of interconnected biochemical reactions. Previous research has 
shown that under realistic conditions only a fraction of these reactions is con- 
currently active in any given cell. This is partially determined by nutrient 
availability, but is also strongly dependent on the metabolic function and net- 
work structure. Here, we establish rigorous bounds showing that the fraction 
of active reactions is smaller (rather than larger) in metabolic networks evolved 
or engineered to optimize a specific metabolic task, and we show that this is 
largely determined by the presence of thcrmodynamically irreversible reactions 
in the network. We also show that the inactivation of a certain number of reac- 
tions determined by irreversibility can generate a cascade of secondary reaction 
inactivations that propagates through the network. The mathematical results 
are complemented with numerical simulations of the metabolic networks of the 
bacterium Escherichia coli and of human cells, which show, counterintuitively, 
that even the maximization of the total reaction flux in the network leads to a 
reduced number of active reactions. 

1. Introduction. The mathematical modeling of biological networks has focused 
on the influence of the network structure on the functional properties of the sys- 
tem [1, 2, 20]. Insights provided by these studies have shown, for example, that 
structural modules and hierarchical organization in the network are often related to 
compartmentalization of functional processes [21, 27]. This is important since in- 
tracellular processes are rarely carried out by individual elements, and often involve 
the coordinated activity of multiple genes, proteins, and biochemical transforma- 
tions. Because different components may be recruited for different processes, the 
most fundamental aspect of the dynamics of complex intracellular networks con- 
cerns precisely the characterization of the specific parts of the network that are 
active under given conditions. 



2000 Mathematics Subject Classification. Primary: 92C42; Secondary: 90C35. 
Key words and phrases, network modeling, cellular metabolism, duality principle, flux balance 
analysis, optimization. 



1 



2 



JOO SANG LEE, TAKASHI NISHIKAWA, ADILSON E. MOTTER 



Recent research focused on the modeling of metabolic networks has shown that 
typical metabolic states tend to recruit a much larger number of reactions than 
states that maximize growth rate [18]. This counterintuitive property is important 
in multiple contexts. For example, it provides a partial explanation for the apparent 
dispensability of a large fraction of genes in single-cell organisms [19, 5, 7], since the 
genes associated with reactions that become inactive in growth-maximizing states 
are expected not to be essential. This also explains why genetic and environmen- 
tal perturbations that cause growth defect are accompanied by a burst in reaction 
activity [9, 10]. These bursts can be attributed to the transient activation of other- 
wise inactive reactions that are recruited by the suboptimal states that follow the 
perturbation [18], since such states tend to have a larger number of active reactions. 
Another important implication concerns the possibility of synthetic rescues [17, 13], 
where the inactivation of one gene can be compensated by the targeted inactivation 
of other genes. The inactivation of such rescue genes can thus allow the recovery of 
lost biological function. This is possible in part because the rescue genes correspond 
to genes that would be inactive in an optimal state, so that disabling them helps 
bring the state of the system closer to the desired optimal state [16]. Understanding 
the root causes of the reduced reaction activity in optimal metabolic states is then 
of significant interest in the characterization and study of cellular metabolism. 

Focusing on steady-state dynamics, here we use flux-balance based analysis and 
linear programming techniques to establish rigorous results on the number of reac- 
tions that can be active in a given metabolic network. We derive conditions for a 
specific reaction to be inactive in all (Sec. 3) or active in almost all (Sec. 4) feasible 
metabolic states. We also establish bounds for the number of reactions that can be 
active in states that optimize an arbitrary linear function of reaction fluxes (Sec. 
5), which are derived based on the duality principle in linear programming, and we 
study the uniqueness of the optimal solution for typical linear objective functions 
(Sec. 6). Finally, we implement numerical simulations in reconstructed Escherichia 
coli and human metabolic networks, both to compare with the rigorous bounds 
and to consider nonlinear objective functions (Sec. 7). Taken together, our results 
show that the reduced number of active reactions in the optimal states of a linear 
objective function, including growth rate, arc mainly determined by the presence 
of irreversible reactions in the network. The irreversibility constraints are shown to 
play a role also in nonlinear objective functions, such as the aggregated flux and 
mass flow activity, whose optimal solutions are shown to have a number of active 
reactions comparable to the corresponding number for linear functions. 

2. Preliminary remarks. We consider time-independent metabolic states, which 
serve as an appropriate representation of the state of single cells at time scales much 
shorter than the lifetime of the cells, as well as of the average behavior of a large 
population of cells at arbitrary time scales in time-invariant conditions. Under this 
steady-state assumption, a cellular metabolic state is a solution of a homogeneous 
linear equation that accounts for all stoichiometric constraints, 

Sv = 0, (1) 

where S is the m x TV stoichiometric matrix and v £ M. N is the vector of metabolic 
fluxes. The components of v = (v\, . . . ,i>at) T include the fluxes of n internal and 
transport reactions as well as n ox exchange fluxes, which model the transport of 
metabolic species across the system boundary. Constraints of the form v% < ft 
imposed on the exchange fluxes are used to limit the maximum uptake rates of 
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substrates in the medium. Additional constraints of the form v% > arise for the 
reactions that are irreversible. Assuming that the cell's operation is mainly limited 
by the availability of substrates in the medium, we impose no other constraints on 
the internal reaction fluxes, except for the ATP maintenance flux, which is set to a 
fixed positive value. These additional constraints can be organized in the form 

ai<Vi<pi, i = l,...,N. (2) 

The set of all flux vectors v satisfying Eqs. (1) and (2) defines the feasible solution 
space M C M. N , representing the capability of the metabolic network as a system. 
Because the number of fluxes N is larger than the number of metabolic species to, 
system (1) is under-determined and M is generally high dimensional. 

Our study is formulated in the context of flux balance analysis [6, 29], which is 
based on the maximization of a metabolic objective function c T v within the feasible 
solution space M (the superscript T is used to denote transpose). This reduces to 
a linear programming problem 

N 

maximize: c T v = c,Uj 

1=1 (3) 
subject to: Sv = 0, v 6 R , 

Q-i < Vi < fa, i=l,...,N, 

where we set on = — oo if Ui is not bounded from below and = oo if Vi is not 
bounded from above. For a given objective function, we can numerically determine 
an optimal flux distribution for this problem. This formulation is also appropriate 
for the derivation of the rigorous results presented below. In the particular case of 
growth maximization, the objective vector c is taken to be parallel to the biomass 
flux, which is modeled as an effective reaction that converts metabolic species into 
biomass. 

3. Inactivity due to stoichiometric constraints. The first question of interest 
is to determine the conditions under which a reaction will be inactive for any solution 
of Eq. (1). Let us define the stoichiometric coefficient vector of reaction i to be the 
ith column of the stoichiometric matrix S. We similarly define the stoichiometric 
coefficient vector of an exchange flux. If the stoichiometric vector of reaction i can 
be written as a linear combination of the stoichiometric vector of reactions/exchange 
fluxes i\, i2, ■ ■ ■ , ife, we say that i is a linear combination of ik- We use 

this linear relationship to completely characterize the set of all reactions that are 
always inactive due to the stoichiometric constraints, regardless of any additionally 
imposed constraints, such as the availability of substrates in the medium, reaction 
irreversibility, cell maintenance requirements, and optimum growth condition. 

Theorem 1. Reaction i is inactive for all v satisfying Sv = if and only if it is 
not a linear combination of the other reactions and exchange fluxes. 

Proof. We denote the stoichiometric coefficient vectors of reactions and exchange 
fluxes by s 1; . . . , Sjv- The theorem is equivalent to saying that there exists v sat- 
isfying both Sv = and ^ if and only if s, is a linear combination of s^, 
k = l,2,...,N,k^i. 

To prove the forward direction in this statement, suppose that ^ in a state 
v satisfying Sv = 0. By writing out the components of the equation Sv = and 
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rearranging, we get 



£(-«*) 



Sjk, 



J 



/ = !,..., 



m. 



k^i 



Since vt ^ 0, we can divide this equation by Uj to see that Sj is a linear combination 
of Sfc, fc ^ i with coefficients = —Vk/vi- 

To prove the backward direction, suppose that Sj = X^fe^i c k s k- If we choose v 
so that Vk = Cfc for k ^ i and Vi = — 1 , then for each j , we have 



Theorem 1 holds true independently of other constraints because the proof docs 
not involve Eq. (2). In particular, the sufficient condition for inactivity applies 
to any nutrient medium condition and does not depend on the reversibility of the 
reactions under consideration. In the case of the reconstructed E. coli (human) 
metabolic network considered in this study (described in Sec. 7), which includes 
922 (3328) unique internal and transport reactions, a total of 141 (475) reactions 
are always inactive in steady states as a result of the condition in Theorem 1. 

4. Activity in typical steady states. The next question of interest concerns 
the number of reactions that will be active with probability one in typical meta- 
bolic states. The stoichiometric constraints Sv = define the linear subspacc 
NulS = {v G M. N | Sv = 0} (the null space of S), which contains the feasible 
solution space M. However, the set M can possibly be smaller than Nul S be- 
cause of the additional constraints arising from environmental and physiochemical 
properties (availability of substrates in the medium, reaction irreversibility, and cell 
maintenance requirements). Therefore, M may have smaller dimension than Nul S. 
If we denote the dimension of M by d, there exists a unique d-dimensional linear 
submanifold of R N that contains M, which we denote by Lm- We can then use the 
Lebesgue measure naturally defined on Lm [23] to make probabilistic statements, 
since we can define the probability of a subset A C M as the Lebesgue measure of 
A normalized by the Lebesgue measure of M. In particular, we say that m for 
almost all v G M if the set {v G M \ Vi = 0} has Lebesgue measure zero on Lm- 
An interpretation of this is that Vi ^ with probability one for an organism in a 
random state under given environmental conditions, which can be used to prove the 
following theorem. 

Theorem 2. If Vi ^ for some v G M, then Vi ^ for almost all v G M. 

Proof. Suppose that Uj ^ for some v G M. The set Li := {v G Lm | = 0} is a 
linear submanifold of Lm, so we have dimL^ < dimLju- If dimi^ = dimLjvf) then 
we have Li = Lm 2 M, implying that we have v% = for any v G M, which violates 
the assumption. Thus, we must have dimL^ < dimLM, implying that Li has zero 
Lebesgue measure on Lm- Since M C Lm, we have Mj := {v G M\vi — 0} C 
{v G Lm I Vi — 0} = Li, and thus Mj also has Lebesgue measure zero. Therefore, 
we have Vi ^ for almost all v G M. □ 

Theorem 2 implies that we can group the reactions and exchange fluxes into two 
categories: 

1. Always inactive: v% = for all v G M, and 




fe 



so v satisfies Sv = 0. 



□ 
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2. Almost always active: Vi ^ for almost all v G M. 
Consequently, the number n + (v) of active reactions satisfies 

n + (v) = nf p := n — n™ — for almost all v G M, (4) 

where n™ is the number of inactive reactions due to the stoichiometric constraints 
(characterized by Theorem 1) and is the number of additional reactions in the 
category 1 above, which are due to the environmental and irreversibility condi- 
tions. Combining this result with the finding that optimal states have fewer active 
reactions (next section), it follows that a typical state v G M is non-optimal. 

Equation (4) will lead to a different number of active reactions for different 
nutrient medium conditions [determined by Eq. (2)], with the general trend that 
this number will be larger in richer medium conditions. In the case of the E. coli 
(human) reconstructed network simulated in glucose minimal medium, as considered 
here (see Sec. 7), the number rog of inactive internal and transport reactions is 182 
(1274), of which 158 (563) are due to environmental limitations and 24 (711) are due 
to reaction irreversibility. The latter includes the cascading-induced inactivation of 
some reactions due to the inactivation of different, irreversible reactions. 



5. Activity in optimal states. We now turn to the central part of our study, 
which concerns the number of reactions that can be active in steady states that 
optimize a linear function of the metabolic fluxes. The linear programming prob- 
lem for finding the flux distribution maximizing a linear objective function can be 
written in the matrix form: 

T 

maximize: c v 

(5) 

subject to: Sv = 0, Av < b, v G R N , 

where A and b are defined as follows. If the ith constraint is Vj < j3j, the iih row 
of A consists of all zeros except for the jth entry that is 1, and bi = /3j. If the ith 
constraint is aj < Vj , the ith row of A consists of all zeros except for the jth entry 
that is —1, and bi = — aj. A constraint of the type aj < vj < fij is broken into two 
separate constraints and represented in A and b as above. The inequality between 
vectors is interpreted as inequalities between the corresponding components, so if 
the rows of A are denoted by af, aj, . . . , a^-, the inequality Av < b represents the 
set of K constraints a^v < bi, i = 1, . . . , K . By defining the feasible solution space 

M := {v G R N | Sv = 0, Av < b}, (6) 

the problem can be compactly expressed as maximizing c T v in M. 

The duality principle [4] expresses that any linear programming problem (primal 
problem) is associated with a complementary linear programming problem (dual 
problem), and the solutions of the two problems are intimately related. The dual 
problem associated with problem (5) is 

minimize: b T Ui 

subject to: A T Ui + S T u 2 = c, Ui > 0, (7) 
ui G M K , u 2 G R m , 

where {ui, u 2 } is the dual variable. A consequence of the Strong Duality Theorem 
[4] is that the primal and dual solutions are related via a well-known optimality 
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condition: v is optimal for problem (5) if and only if there exists {ui, 112} such that 

Sv = 0, Av < b, (8) 

A T Ui + S T u 2 = c, ui > 0, (9) 

(Av - b) = 0. (10) 

Note that each component of Ui can be positive or zero, and we can use this 
information to find a set of reactions that are forced to be inactive under opti- 
mization, as follows. For any given optimal solution vo, Eq. (10) is equivalent to 
uu(afvQ — hi) = 0, i = 1,. . . ,K, where uu is the ith component of ui. Thus, if 
Un > for a given i, we have afvo = bi, and we say that the constraint afv < b% 
is binding at vo- In particular, if an irreversible reaction (y% > 0) is associated with 
a positive dual variable {uu > 0), then the irreversibility constraint is binding, and 
the reaction is inactive (v{ = 0) at vo. In fact, we can say much more: we prove the 
following theorem stating that such a reaction is actually required to be inactive for 
all possible optimal solutions for a given objective function c T v. 

Theorem 3. Suppose {111,112} is a dual solution corresponding to an optimal solu- 
tion of problem (5). Then, the set M opt of all optimal solutions of problem (5) can 
be written as 

Mopt — {v 6 AI I ajv = bi for all i for which uu > 0}, (11) 

and hence every reaction associated with a positive dual component is binding for 
all optimal solutions in M opt . 

Sketch of proof . Let vo be the optimal solution associated with {111,112} and let Q 
denote the right hand side of (11). Any v G Q is an optimal solution of problem (5), 
since straightforward verification shows that it satisfies (8-10) with the same dual 
solution {111,112}. Thus, we have Q C Af opt . Conversely, suppose that v is an 
optimal solution of problem (5). Then, v can be shown to belong to H, which we 
define to be the hyperplane that is orthogonal to c and contains vo, i.e., 

H := {v e R N I c T (v - v ) = 0}. 

This, together with the fact that v satisfies Sv = and Av < b, from (8), can 
be used to show that v € Q. Therefore, any optimal solution must belong to Q. 
Putting both directions together, we have Q = M opt . □ 

As an example, consider the five- reaction network shown in Fig. 1(a) where the 
flux U4 is maximized. The problem can be written in the form of problem (5) with 



/ 1 0\ 

-1 

0-1 

0-100 

\ -1/ 



/ 1\ 

-1 




V 0/ 



and S 



1-1 0-1 
1-1 0-1 



Note that the equality constraint v± = 1 is split into two inequality constraints 
V\ > 1 and V\ < 1 for convenience and corresponds to the first two rows of A and 
b. The optimal solution space M opt consists of the single point v = (1,0,0, 1,0) T 
and a possible choice of corresponding dual solution {ui, 112} is given by 



ui = (1, 0, 1, 0, 0) T and u 2 = (-1, 0) T 
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(a) 



s 




P ^ A A 



M op t = {v e M\v 2 =0} ui = (1,0,1,0,0) T 
= {(1,0, 0,1, Of} u 2 = (-l,0) T 



(b) 



«3 + "4 




Figure 1. Simple example problem with five reactions, (a) Re- 
action network, where the flux v\ producing P is maximized, (b) 
Feasible solution space M and optimal solution space M opt in the 
projection of the flux space onto the (i>3, U4, ^-coordinates, (c) M 
and M op t m the projection of the flux space onto the (i>2, ^3, Ub)- 
coordinates. 

For this dual solution, we have Uxi = 1 > (corresponding to the constraint 
vi < 1) and U13 = 1 > (corresponding to the constraint — v 2 < 0), and Eq. (11) 
in Theorem 3 becomes 



Note that the constraint v\ = 1 can be omitted since it is satisfied by any veM 
in this example. 

Once we solve problem (5) numerically and obtain a single pair of primal and 
dual solutions (vo and {ui, 112}), we can use the characterization of M opt given in 
Eq. (11) to identify all reactions that arc required to be inactive (or active) for any 
optimal solutions. To do this we solve the following auxiliary linear optimization 
problems for each i = 1, . . . , N: 

maximize/minimize: 

subject to: Sv = 0, Av < b, af v = bi for all i for which uu > 0. 

If the maximum and minimum of Vi are both zero, then the corresponding reaction 
is required to be inactive for all v £ M opt . If the minimum is positive or maximum 
is negative, then the reaction is required to be active. Otherwise, the reaction 
may be active or inactive, depending on the choice of an optimal solution. Thus, 
we obtain the numbers n^ pt and riQ Pt of internal and transport reactions that are 
required to be active and inactive, respectively, for all v £ M opt . The number of 
active reactions for any v £ M opt is then bounded as 



M opt = {v £ M I «! = 1, v 2 = 0} = {v £ M I v 2 = 0}. 



n 



'+ P < «■+ (v) < n — nj 



opt 




(12) 
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The distribution of n+(v) within the bounds is singular: the upper bound in 
Eq. (12) is attained for almost all v £ M opt . To see this, we apply Theorem 2 
with M replaced by M opt . This is justified since we can obtain M opt from M by 
simply imposing additional equality constraints. Therefore, if we set aside the rig pt 
reactions that are required to be inactive (including n™ and n§ reactions that are 
inactive for all v £ M), all the other reactions are active for almost all v £ M opt . 
Consequently, 

n+(v) = n — riQ Pt for almost all v £ M opt . (13) 

We can also use Theorem 3 to further classify those inactive reactions caused by 
the optimization as due to two specific mechanisms: 

1. Irreversibility. The irreversibility constraint (u< > 0) on a reaction can be 
binding («j = 0) , which directly forces the reaction to be inactive for all opti- 
mal solutions. Such inactive reactions arc identified by checking the positivity 
of dual components (uu). 

2. Cascading. All other reactions that are required to be inactive for all v £ M op t 
are due to a cascade of inactivity triggered by the first mechanism, which 
propagates over the metabolic network via the stoichiometric constraints. 

These inactive reactions occur in addition to the irreversibility /cascading- induced 
reaction inactivation identified in Sec. 4 for typical (in fact all) steady states. 

In general, a given solution of problem (5) can be associated with multiple dual 
solutions. The set and the number of positive components in ui can depend on 
the choice of a dual solution, and therefore the categorization according to these 
specific mechanisms is generally not unique. For the example problem of Fig. 1, it 
is clear from M opt = {(1, 0, 0, 1, 0) T } that three reactions V2, «3, and v 5 are required 
to be inactive in the optimal state. The dual solution given above categorizes vi 
under irreversibility, and V3 and v$ under cascading. This reflects the fact that 
making V2 = under the stoichiometric constraint V2 = V3 + v$, along with the 
irreversibility constraints V3 > and v$ > 0, forces V3 = v§ = [Fig. 1(c)]. Another 
possible dual solution is given by 

ui = (1, 0, 0, 1, 1) T and u 2 = (-1, -1) T , 

which leads to an alternative characterization of the same M opt : 

M opt = {v £ M\v 3 = v 5 =0}, 

categorizing V3 and v§ under irreversibility, and V2 under cascading. One can indeed 
see graphically in the projection onto (V3, i>4, ^-coordinates in Fig. 1(b) that the 
maximization of V4 under the stoichiometric constraint V3 + v± + V5 = 1 (which 
follows from v\ = 1 and Sv = 0) forces the irreversible fluxes V3 and v§ to be 
zero, which in turn forces V2 — [Fig. 1(c)]. Clearly, one can also characterize the 
optimal solution space as 

M opt = {veM\v 2 = v 3 = v 5 = 0}, 

which corresponds for example to the choice of dual solution given by 

Ul = (1, 0, |, i, \) T and u 2 = (-1, -i) T . 

This leads to the categorization of all three inactive reactions under irreversibility. 
Thus, we can interpret the non-uniqueness of the categorization as the fact that 
different sets of triggering inactive reactions can create the same cascading effect 
on the reaction activity. 
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Figure 2. Uniqueness of optimal solutions, (a) For a typical ob- 
jective vector c, a unique optimum is obtained at a single extreme 
point, (b) In the exceptional cases where c is perpendicular to an 
edge, all points on the edge are optimal. 

The results above are important both because they can be applied to any linear 
objective function and because a significant fraction of real metabolic reactions are 
irreversible. In the case of the E. coli (human) reconstructed network, a total of 
73.4% (65.6%) of all internal and transport reactions are irreversible. Moreover, 
a number of other reactions are effectively irreversible because the irreversibility 
of different reactions in the same pathway constrains them not to run in one of 
the two directions; this leads to a total of 94.3% (74.9%) of the reactions whose 
fluxes are either necessarily nonnegative or necessarily nonpositive in all steady-state 
solutions. In the case of growth-maximizing states for the conditions considered in 
our numerical experiments, out of all 922 (3328) internal and transport reactions 
in the reconstructed network, a total of 146 (106) reactions are inactive due to 
irreversibility constraints, and a total of 114 (293) other reactions are inactive due to 
a cascade of reaction inactivation; some of the irreversible reactions can be assigned 
to either the first or the second of these two groups. The bounds provided by Eq. 
(12) depend on the objective function. In the case of growth-maximizing states, the 
lower and upper bounds are 273 (113) and 339 (1180), respectively. These numbers 
should be compared with the number 599 (1579) of reactions that are active in 
typical, suboptimal states. This clearly shows that optimal states are necessarily 
constrained to have a smaller number of active reactions, and that this is due to 
the presence of irreversible reactions in the network. 

6. Typical linear objective functions. Another problem of interest concerns 
the uniqueness of the optimal solutions. While a number of necessary and/or suf- 
ficient conditions for this uniqueness are known [28, 15], we are not aware of any 
probabilistic statements in the literature addressing this issue. Since the feasible 
solution space M is convex, its "corners" can be mathematically formulated as ex- 
treme points, defined as points v £ M that cannot be written as v = ax + by with 
a + 6= l,0<a<l and x,y £ M such that x ^ y. Intuition from the two- 
dimensional case (Fig. 2) suggests that for a typical choice of the objective vector 
c such that problem (5) has a solution, the solution is unique and located at an 
extreme point of M . We prove here that this is indeed true in general, as long as 
the objective function is bounded on M, and hence an optimal solution exists. 



10 



JOO SANG LEE, TAKASHI NISHIKAWA, ADILSON E. MOTTER 



Theorem 4. Suppose that the set of objective vectors 

B = {c G M. N c T v is bounded on M} 

has positive Lebesgue measure. Then, for almost all c in B, there is a unique 
solution of problem (5), and it is located at an extreme point of M . 

Proof. For a given c G B, the function c T v is bounded on M, so the solution 
set M opt = M op t(c) of problem (5) consists of either a single point or multiple 
points. Suppose M opt consists of a single point v and it is not an extreme point. 
By definition, it can be written as v = ax + by with o + 6= l,0<a<l and 
x, y G M such that x^y. Since v is the only solution of problem (5), x and y 
must be suboptimal, and hence we have c T x < c T v and c T y < c T v. Then, 

c T y = (c T v — ac T x)/b 
> (c T v-ac T v)/fe 

T 

and we have a contradiction with the fact that v is optimal. Therefore, if M opt 
consists of a single point , it must be an extreme point of M. 

We are left to show that the set of c G B for which M op t(c) consists of multiple 
points has Lebesgue measure zero. By Theorem 3, for a given c, there exists a set of 
indices I C {1, . . . , K} such that M opt (c) = Qj := {v G M \ afv = bi for all i G /}, 
so 

{c G R N M opt (c) contains multiple points} C |J{c G M. N | Q/ = M opt (c)}, (14) 

i 

where the union is taken over all / C {1, . . . , K} for which Qi contains multiple 
points. If c is in one of the sets in the union in Eq. (14), the set Qi, being the 
set of all optimal solutions, is orthogonal to c. Hence, c is in Qj~, the orthogonal 
complement of Qi defined as the set of all vectors orthogonal to Qj. Therefore, 

{c G 1^ | M opt (c) contains multiple points} C [JQ] 1 , (15) 

Because Qi is convex, it contains multiple points if and only if its dimension is at 
least one, implying that each Qj- in the union in Eq. (15) has dimension at most 
N — 1, and hence has zero Lebesgue measure in R N . Since there are only a finite 
number of possible choices for / C {1, . . . , K}, the right hand side of Eq. (15) is a 
finite union of sets of Lebesgue measure zero. Therefore, the left hand side also has 
Lebesgue measure zero. □ 

Note that growth rate is not a typical objective function. Because this objective 
function has nonrandom coefficients and involves only a fraction of all metabolic 
fluxes, the objective vector c is generally perpendicular to a surface limiting the 
space of feasible solutions. For this reason, the growth-maximizing states are gener- 
ally not unique. In the case of the E. coli reconstructed network simulated in glucose 
minimal medium, our numerical calculations indicate that the growth-maximizing 
solutions form a space that is 26-dimensional. In the case of the human recon- 
structed network, the corresponding dimension is 494. 
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7. Numerical experiments. Two questions follow from the results above. First, 
given the a priori surprising finding that metabolic activity as measured by the 
number of active reactions decreases in optimal states, what happens if we use other 
measures of metabolic activity such as total reaction flux in the network? Second, 
given that these results were derived for linear objective functions, to what extent 
does the observed reduction in the number of active reactions manifest itself in 
nonlinear objective functions of biological significance? These two questions are best 
examined using numerical experiments. Both are addressed below by considering 
the following objective functions: 

1. Total flux in the network: Defined as 0/ = ^2n \vi\, where i runs through all 
internal and transport reactions (excluding the biomass flux), it measures the 
overall metabolic activity while accounting for the differences in the fluxes of 
different reactions. This objective function is nonlinear because of the absolute 
value used to properly measure the flux of the reversible reactions. 

2. Total mass flow in the network: Defined as <p m = ^ i mj|^i|, where m t is 
the mass of the reactants (or products) involved in reaction i, it measures the 
overall metabolic activity weighted by the mass flow of each reaction [14]. The 
sum is over the same reaction set considered in the definition of 4>f ■ 

For other nonlinear objective functions of biological significance in cellular metab- 
olism, wc refer to Rcf. [24]. 

Figure 3 shows the results of our numerical experiments for <p f and <f> m on the E. 
coli reconstructed network. In both cases, the maximum (minimum) of the objective 
function for a given growth rate decreases (increases) as the growth rate increases, 
and it converges to essentially a single intermediate value for all states that maximize 
growth rate [Fig. 3(a,b)]. The average activity, which we determined by randomly 
sampling the solution space, is essentially constant (it decreases very slowly as 
the growth rate increases). The sampling of the solution space was performed 
using the hit-and-run method [26], which is an efficient algorithm to sample high- 
dimensional convex regions. Our implementation of this method is as described in 
our previous study [7] and involves artificial centering [12]. The observed behavior 
of the total flux activity and total mass flow activity should be contrasted with 
metabolic activity as measured in terms of the number of active reactions, which is 
significantly smaller at growth-maximizing states. 

The average number of active reactions in the 4>f -maximizing states across dif- 
ferent growth rates is just 302 out of a total of 571 that would be active in typical 
states (the latter too is an average over different growth rates, and is smaller than 
the number 599 anticipated in Sec. 5 because of additional constraints set to the 
diverging cycles throughout this section — see below). A similar result holds true 
for </> m -maximizing states (Table 1). Therefore, the maximizations of total flux and 
total mass flow in the network also lead to a reduced (rather than increased) num- 
ber of active reactions compared to typical states [such as those determined by the 
hit-and-run method in Fig. 3(a,b)]. This number varies very little with growth rate 
and is essentially undistinguishablc from the number of active reactions in growth- 
maximizing states (see standard deviations in Tabic 1). While these results concern 
E. coli, we note that similar trends are observed for the human metabolic network. 

If the irreversible reactions are made reversible [Fig. 3(c,d)], then the number 
of active reactions increases. For states that minimize </>/ and <f) m , the number 
of active reactions jumps to a large number when ctj of the irreversible reactions 
is assigned to be just slightly negative, and then decreases as the irreversibility 
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Figure 3. Metabolic activity measured in terms of total flux and 
total mass flow for the E. coli reconstructed network simulated in 
glucose minimal medium. (a,b) Maximum and minimum of the 
total flux <f>f in units of mmol/g DW-h (a) and of the total mass 
flow 4> m in units of g/g DW-h (b) (continuous lines) as a function 
of the growth rate (normalized by its maximum under the given 
conditions), where g DW denotes grams of dry weight. The dashed 
lines indicate the average in the space of feasible solutions calcu- 
lated from 5 x 10 6 randomly selected points obtained using the 
hit-and-run method. The standard deviation in estimating the av- 
erage is smaller than the size of the symbols. (c,d) Maximum and 
minimum <f)f (c) and <fi m (d) when the irreversibility constraints 
cti are relaxed from zero to —0.1 mmol/g DW-h for all irreversible 
reactions. Allowing all the reactions to be reversible leads to mod- 
erate changes in the optimal values of the objective functions, but 
it leads to drastic changes in the number of active reactions (Table 
1). 



constraints are further relaxed (Table 1). For states that maximize </>/ and <p m , the 
increase in the number of active reactions is by a factor of nearly 2 (Table f ). This 
number is comparable to the number of active reactions in typical suboptimal states 
of the original network. Therefore, like in the case of linear objective functions, the 
reduced number of active reactions found in states that maximize the total flux or 
the total mass flow is due to the presence of irreversible reactions in the metabolic 
network. 

All simulations presented in this paper are based on a reconstructed metabolic 
network of E. coli K-12, which represents a further curated version of the UR904 
model [22] in which duplicated reactions have been removed, and on the most 
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Table 1 . Number of active reactions in states maximizing or min- 
imizing the total flux, and the total mass flow, <p m . The relax- 
ation of the irreversibility constraints is implemented by allowing 
a,; to be negative for all irreversible reactions, as indicated in the 
leftmost column. Each column shows the average and standard 
deviation calculated over the growth rates considered in Fig. 3. 





max (f> j 


min 4>f 


max m 


min 4> m 


actual irreversibility 


301.9 (2.6) 


292.3 (2.7) 


300.4 (2.7) 


292.1 (2.6) 


irreversibility relaxed to —1Q~' > 
irreversibility relaxed to — 10~" 
irreversibility relaxed to — 10 _1 
irreversibility relaxed to —1 


580.5 (8.6) 
592.7 (4.5) 
595.4 (4.2) 
595.0 (4.0) 


506.8 (14.5) 
502.2 (17.8) 
469.4 (38.0) 
370.1 (35.7) 


593.2 (8.1) 
597.9 (2.5) 

596.6 (2.2) 

596.7 (3.5) 


501.1 (17.4) 
502.1 (17.7) 
475.8 (32.0) 
404.0 (41.2) 



complete reconstructed human metabolic network, generated by applying the same 
curation to the Homo sapiens Recon 1 model [8] . The E. coli (human) network used 
consists of 922 (3328) reactions, 901 (1491) enzyme- and transport protein-coding 
genes, 618 (2766) metabolites, 143 (404) exchange fluxes, and the biomass flux. 
For the E. coli network, the simulated medium had limited amount of glucose (10 
mmol/g DW-h) and oxygen (20 mmol/g DW-h), and unlimited amount of sodium, 
potassium, carbon dioxide, iron (II), protons, water, ammonia, phosphate, and 
sulfate; the flux through the ATP maintenance reaction was set to 7.6 mmol/g 
DW-h. For the human network, we used a medium with limited amount of glucose 
(1 mmol/g DW-h) and unlimited amount of oxygen, sodium, potassium, calcium, 
iron (II and III), protons, water, ammonia, chlorine, phosphate, and sulfate; for 
the biomass composition, we followed Ref. [25]. In our simulations, 10~ 6 mmol/g 
DW-h was used as a flux threshold to define the set of reactions considered active. 
A few cycles whose flux or mass flow would diverge in the optimization of the 
corresponding objective function were assigned the minimum feasible flux in the 
optimization of (f>f and the minimum feasible mass flow in the optimization of 
4> m (under the constraint of not altering the fluxes of the other reactions). These 
minimum flux values were also adopted as bounds in our hit-and-run sampling. All 
numerical calculations were implemented using the COBRA Toolbox [3] and the 
CPLEX optimization software [11]. 
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